Agent API: JSON actions, observations, phase flow #1

Open
JackSwitzer wants to merge 1 commit into main from work/agent-api

@JackSwitzer JackSwitzer commented Feb 4, 2026

Summary

  • Implement get_available_action_dicts() returning JSON actions
  • Implement take_action_dict() with validation
  • Implement get_observation() with full schema
  • Add phase transition validation
  • Expose map with nodes/edges/available_paths

Test Results

31 tests passing in test_agent_api.py

Files Changed

  • packages/engine/agent_api.py (NEW, ~1100 lines)
  • tests/test_agent_api.py (NEW, ~560 lines)

🤖 Generated with Claude Code


Note

Medium Risk
Adds a large, side-effecting agent_api module that monkey-patches GameRunner and introduces new code paths for phase transitions (e.g., boss relic skipping/treasure leaving). While mostly additive, mistakes could affect run flow or determinism for agent integrations.

Overview
Adds a new agent_api surface that exposes GameRunner.get_available_action_dicts(), GameRunner.take_action_dict(), and GameRunner.get_observation() for RL agents using JSON-serializable actions/observations across phases (Neow, map navigation, combat, rewards, events, shop, rest, treasure, boss rewards).

take_action_dict() maps action dicts into existing engine actions and includes explicit handling for special transitions like skip_boss_relic and leave_treasure. The engine package now auto-imports agent_api (auto-patching GameRunner) and exports the new TypedDict types.

Adds a comprehensive tests/test_agent_api.py suite covering action generation/execution, observation schema/JSON-serializability, phase transitions, and determinism, and updates tests/conftest.py to use a repo-relative path setup.

Written by Cursor Bugbot for commit 61024cf. This will update automatically on new commits.

Add JSON-serializable action and observation interfaces for agents:

- get_available_action_dicts(): Returns list of ActionDict with id, type, label, params, phase
- take_action_dict(action): Executes action dict and returns ActionResult
- get_observation(): Returns complete observable game state as ObservationDict
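A minimal sketch of how an agent might drive these three methods. `StubRunner` is a hypothetical stand-in so the loop runs on its own; the real `GameRunner` is not shown here, and the `success`/`error` keys of the result, the `node_index` parameter, and the `game_over` flag are assumptions made for illustration only.

```python
import random
from typing import Any, Dict, List


class StubRunner:
    """Hypothetical stand-in for GameRunner; it only mimics the three method names."""

    def __init__(self, floors: int = 3) -> None:
        self.floor = 0
        self.floors = floors

    def get_available_action_dicts(self) -> List[Dict[str, Any]]:
        # No actions once the (toy) run is over.
        if self.floor >= self.floors:
            return []
        return [
            {
                "id": f"path_choice:{i}",
                "type": "path_choice",
                "label": f"Path {i}",
                "params": {"node_index": i},  # assumed parameter name
                "phase": "map",
            }
            for i in range(2)
        ]

    def take_action_dict(self, action: Dict[str, Any]) -> Dict[str, Any]:
        # Reject anything that is not currently offered.
        legal_ids = {a["id"] for a in self.get_available_action_dicts()}
        if action.get("id") not in legal_ids:
            return {"success": False, "error": "action not available"}
        self.floor += 1
        return {"success": True, "error": None}

    def get_observation(self) -> Dict[str, Any]:
        return {"run": {"floor": self.floor}, "game_over": self.floor >= self.floors}


def run_random_episode(runner: StubRunner, seed: int = 0) -> int:
    """Pick uniformly among the offered actions until none remain; return step count."""
    rng = random.Random(seed)
    steps = 0
    while True:
        actions = runner.get_available_action_dicts()
        if not actions:
            break
        result = runner.take_action_dict(rng.choice(actions))
        assert result["success"], result["error"]
        steps += 1
    return steps
```

Seeding the agent's own `random.Random` keeps the action sequence reproducible, which is what a determinism test would rely on.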

Action types implemented:
- Combat: play_card, use_potion, end_turn
- Map: path_choice
- Events: event_choice, neow_choice
- Rewards: pick_card, skip_card, singing_bowl, claim_gold/potion/relic, etc.
- Shop: buy_card, buy_relic, buy_potion, remove_card, leave_shop
- Rest: rest, smith, dig, lift, toke, recall
- Treasure: take_relic, sapphire_key, leave_treasure
- Boss: pick_boss_relic, skip_boss_relic
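As an illustration of the action shape, here is a sketch of an `ActionDict` with the five fields named above, plus a small validator. The `treasure` and `boss_reward` phase names come from the review snippets in this thread; the `combat` phase name and the `validate_action` helper are assumptions, and the table covers only a subset of the listed types.

```python
from typing import Any, Dict, List, Set, TypedDict


class ActionDict(TypedDict):
    id: str
    type: str
    label: str
    params: Dict[str, Any]
    phase: str


# Subset of the action types listed above. "combat" is an assumed phase name.
ACTION_TYPES_BY_PHASE: Dict[str, Set[str]] = {
    "combat": {"play_card", "use_potion", "end_turn"},
    "treasure": {"take_relic", "sapphire_key", "leave_treasure"},
    "boss_reward": {"pick_boss_relic", "skip_boss_relic"},
}


def validate_action(action: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the dict looks well-formed."""
    problems = [
        f"missing key: {key}"
        for key in ("id", "type", "label", "params", "phase")
        if key not in action
    ]
    if problems:
        return problems
    allowed = ACTION_TYPES_BY_PHASE.get(action["phase"])
    if allowed is None:
        problems.append(f"unknown phase: {action['phase']}")
    elif action["type"] not in allowed:
        problems.append(
            f"type {action['type']!r} not valid in phase {action['phase']!r}"
        )
    return problems
```

Returning a list of problems rather than raising makes it easy to surface every issue in an error payload at once.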

Observation includes:
- run: seed, ascension, act, floor, gold, hp, deck, relics, potions, keys
- map: nodes, edges, available_paths, visited_nodes
- combat: player, energy, stance, hand, draw_pile, discard_pile, enemies
- event: event_id, phase, choices
- reward: gold, potion, card_rewards, relic, boss_relics
- shop: colored_cards, colorless_cards, relics, potions, purge_cost
- rest: available_actions
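A sketch of the kind of check a schema test could run on an observation: round-trip it through `json` and confirm the `run` section carries the keys listed above. `check_observation` is a hypothetical helper, and the sample values in the usage note are invented placeholders, not output from the engine.

```python
import json
from typing import Any, Dict, List

# Keys of the "run" section as listed in the commit message.
EXPECTED_RUN_KEYS = {
    "seed", "ascension", "act", "floor", "gold",
    "hp", "deck", "relics", "potions", "keys",
}


def check_observation(obs: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the observation passed."""
    try:
        round_tripped = json.loads(json.dumps(obs))
    except (TypeError, ValueError) as exc:
        # e.g. a set or a custom object leaked into the observation
        return [f"not JSON-serializable: {exc}"]
    problems = []
    if round_tripped != obs:
        # Tuples come back as lists and non-string keys come back as strings.
        problems.append("JSON round trip changed the observation")
    missing = EXPECTED_RUN_KEYS - set(obs.get("run", {}))
    if missing:
        problems.append(f"run is missing: {sorted(missing)}")
    return problems
```

For example, `check_observation({"run": {"seed": "PLACEHOLDER", "ascension": 0, "act": 1, "floor": 1, "gold": 0, "hp": 1, "deck": [], "relics": [], "potions": [], "keys": []}})` returns an empty list, while an observation containing a Python `set` is reported as not serializable.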

Also fixes conftest.py to use relative paths for worktree compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

"label": "Leave",
"params": {},
"phase": "treasure",
})

Agent API adds invalid leave_treasure action not in game

Medium Severity

The generate_treasure_actions function adds a leave_treasure action that doesn't exist in the original game logic. In Slay the Spire, treasure chests require either taking the relic or (in Act 3) taking the sapphire key - you cannot simply leave without taking anything. The original _get_treasure_actions() in game.py only provides take_relic and sapphire_key options. This introduces non-standard game behavior that could lead to RL agents learning invalid strategies.
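One way to catch this class of divergence is a parity check between the action types the agent API offers and those the engine offers for the same state. The sketch below is hypothetical: `extra_action_types` is not part of the PR, and the engine's set is hard-coded from the description above rather than read from `game.py`.

```python
from typing import Any, Dict, List, Set


def extra_action_types(
    agent_actions: List[Dict[str, Any]], engine_action_types: Set[str]
) -> Set[str]:
    """Return action types offered by the agent API that the engine does not offer."""
    return {action["type"] for action in agent_actions} - engine_action_types


# Assumed inputs, mirroring the situation described in this comment.
agent_actions = [
    {"id": "take_relic", "type": "take_relic", "label": "Take relic",
     "params": {}, "phase": "treasure"},
    {"id": "leave_treasure", "type": "leave_treasure", "label": "Leave",
     "params": {}, "phase": "treasure"},
]
engine_types = {"take_relic", "sapphire_key"}

extras = extra_action_types(agent_actions, engine_types)
```

Here `extras` contains only `leave_treasure`; a test could assert that it is empty, or that it matches an explicit allow-list if the extra action is intentional.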

"label": "Skip boss relic",
"params": {},
"phase": "boss_reward",
})

Agent API adds invalid skip_boss_relic action not in game

Medium Severity

The generate_boss_reward_actions function adds a skip_boss_relic action that doesn't exist in the original game. In Slay the Spire, after defeating a boss you must choose one of the three offered boss relics - skipping is not an option. The original _get_boss_reward_actions() in game.py only returns BossRewardAction(i) for each relic without a skip option. This introduces non-standard game behavior.

available.append("lift")

if runner.run_state.has_relic("Peace Pipe"):
    available.append("toke")

Rest observation says toke available without checking removable cards

Low Severity

The generate_rest_observation function adds "toke" to available_actions when the player has the Peace Pipe relic, without checking if there are actually removable cards. However, generate_rest_actions only generates toke actions if get_removable_cards() returns cards. This inconsistency means the observation might indicate "toke" is available when no actual toke actions exist. While decks are rarely empty in practice, this creates an inconsistent API contract.
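One possible fix is to derive the observation's `available_actions` from the generated actions, so the two cannot disagree. The sketch below reuses the two function names from this comment but with simplified, assumed signatures (plain lists of relic and card names instead of a runner); it is not the PR's implementation.

```python
from typing import Any, Dict, List


def generate_rest_actions(
    relics: List[str], removable_cards: List[str]
) -> List[Dict[str, Any]]:
    """Simplified action generator: rest is always offered, toke only if possible."""
    actions = [
        {"id": "rest", "type": "rest", "label": "Rest", "params": {}, "phase": "rest"}
    ]
    if "Peace Pipe" in relics:
        # One toke action per removable card; none if the list is empty.
        for index, card in enumerate(removable_cards):
            actions.append({
                "id": f"toke:{index}",
                "type": "toke",
                "label": f"Remove {card}",
                "params": {"card_index": index},  # assumed parameter name
                "phase": "rest",
            })
    return actions


def generate_rest_observation(
    relics: List[str], removable_cards: List[str]
) -> Dict[str, Any]:
    """Build available_actions from the same actions the agent can actually take."""
    available: List[str] = []
    for action in generate_rest_actions(relics, removable_cards):
        if action["type"] not in available:
            available.append(action["type"])
    return {"available_actions": available}
```

With this shape, `"toke"` appears in the observation only when at least one toke action exists.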
